Equatorial Guinea
10 captivating images from National Geographic's Photo Ark
Since 2006, the project has photographed 17,000 species in the world's zoos, aquariums, and wildlife sanctuaries. Photographs from the Photo Ark will be featured in the inaugural exhibition at the National Geographic Museum of Exploration in Washington, D.C. A picture is said to be worth a thousand words, but some photographs are worth 17,000. Well, 17,000 species, that is. For National Geographic's Photo Ark project, photographer Joel Sartore is documenting all species living in the world's zoos, aquariums, and wildlife sanctuaries.
1,500th discovered bat species is a 'god of the island'
What better way to kick off Bat Appreciation Month? It's official: the world's 1,500th known bat species has been discovered in Equatorial Guinea. And as luck would have it, the announcement is just in time for Bat Appreciation Month. Still, biologists estimate that bats have existed for at least 55 to 56 million years.
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition
Brookes, Otto, Kukushkin, Maksim, Mirmehdi, Majid, Stephens, Colleen, Dieguez, Paula, Hicks, Thurston C., Jones, Sorrel, Lee, Kevin, McCarthy, Maureen S., Meier, Amelia, Normand, Emmanuelle, Wessling, Erin G., Wittig, Roman M., Langergraber, Kevin, Zuberbühler, Klaus, Boesch, Lukas, Schmid, Thomas, Arandjelovic, Mimi, Kühl, Hjalmar, Burghardt, Tilo
Computer vision analysis of camera trap video footage is essential for wildlife conservation, as captured behaviours offer some of the earliest indicators of changes in population health. Recently, several high-impact animal behaviour datasets and methods have been introduced to encourage their use; however, the role of behaviour-correlated background information and its significant effect on out-of-distribution generalisation remain unexplored. In response, we present the PanAf-FGBG dataset, featuring 20 hours of wild chimpanzee behaviours, recorded at over 350 individual camera locations. Uniquely, it pairs every video containing a chimpanzee (referred to as a foreground video) with a corresponding background video (with no chimpanzee) from the same camera location. We present two views of the dataset: one with overlapping camera locations and one with disjoint locations. This setup enables, for the first time, direct evaluation of in-distribution and out-of-distribution conditions, and allows the impact of backgrounds on behaviour recognition models to be quantified. All clips come with rich behavioural annotations and metadata, including unique camera IDs and detailed textual scene descriptions. Additionally, we establish several baselines and present a highly effective latent-space normalisation technique that boosts out-of-distribution performance by +5.42% mAP for convolutional and +3.75% mAP for transformer-based models. Finally, we provide an in-depth analysis of the role of backgrounds in out-of-distribution behaviour recognition, including the so far unexplored impact of background durations (i.e., the count of background frames within foreground videos).
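As a rough illustration of the disjoint-location view described in the abstract, the sketch below splits clips so that no camera location appears in both the training and test sets. The clip records, the `camera_id` field name, and the `split_by_camera` helper are hypothetical and are not taken from the PanAf-FGBG release.

```python
import random
from collections import defaultdict


def split_by_camera(clips, test_fraction=0.2, seed=0):
    """Split clips so that train and test share no camera location.

    `clips` is a list of dicts, each with a hypothetical "camera_id" key.
    """
    # Group clips by the camera that recorded them.
    by_camera = defaultdict(list)
    for clip in clips:
        by_camera[clip["camera_id"]].append(clip)

    # Shuffle camera IDs reproducibly, then hold out a fraction of cameras.
    cameras = sorted(by_camera)
    random.Random(seed).shuffle(cameras)
    n_test = max(1, int(len(cameras) * test_fraction))

    test = [c for cam in cameras[:n_test] for c in by_camera[cam]]
    train = [c for cam in cameras[n_test:] for c in by_camera[cam]]
    return train, test
```

Splitting by camera rather than by clip is what makes the test set out-of-distribution with respect to backgrounds: a model cannot score well simply by memorising scenes it saw during training.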
Geo-Semantic-Parsing: AI-powered geoparsing by traversing semantic knowledge graphs
Nizzoli, Leonardo, Avvenuti, Marco, Tesconi, Maurizio, Cresci, Stefano
Online Social Networks (OSN) are privileged observation channels for understanding the geospatial facets of many real-world phenomena [1]. Unfortunately, in most cases OSN content lacks explicit and structured geographic information, as in the case of Twitter, where only a minimal fraction (1% to 4%) of messages are natively geotagged [2]. This shortage of explicit geographic information drastically limits the exploitation of OSN data in geospatial Decision Support Systems (DSS) [3]. Conversely, the prompt availability of geotagged content would empower existing systems and would open up the possibility of developing new and better geospatial services and applications [4, 5]. As a practical example, several social media-based systems have been proposed in recent years for mapping and visualizing situational information in the aftermath of mass disasters (a task dubbed crisis mapping) in an effort to augment emergency response [6, 7]. These systems, however, require geotagged data to be placed on crisis maps, which in turn requires geoparsing the majority of social media content. Explicit geographic information is needed not only in early warning [8, 9] and emergency response systems [10, 11, 12, 13, 14], but also in systems and applications for improving event promotion [15, 16], touristic planning [17, 18, 19], healthcare accessibility [20], news aggregation [21]
Post-print of the article published in Decision Support Systems 136, 2020. Please refer to the published version: doi.org/10.1016/j.dss.2020.113346
Unveiling AI's Threats to Child Protection: Regulatory efforts to Criminalize AI-Generated CSAM and Emerging Children's Rights Violations
Kokolaki, Emmanouela, Fragopoulou, Paraskevi
This paper presents alarming new trends in the field of child sexual abuse through imagery, as part of SafeLine's research activities in the field of cybercrime, child sexual abuse material and the protection of children's rights to safe online experiences. It focuses primarily on the phenomenon of AI-generated CSAM, the sophisticated production methods discussed in dark web forums, and the crucial role that open-source AI models play in the evolution of this overwhelming phenomenon. The paper's main contribution is a correlation analysis between the hotline's reports and domain names identified in dark web forums, where users' discussions focus on exchanging information specifically related to the generation of AI-CSAM. The objective was to reveal the close connection between clear net and dark web content, which was accomplished through the use of the ATLAS dataset of the Voyager system. Furthermore, an analysis of a set of posts drawn from this dataset yields valuable conclusions about the techniques forum members employ to produce AI-generated CSAM; users' views on this type of content, and the routes they follow to overcome technological barriers intended to prevent malicious use, are also presented. As the final contribution of this research, an overview of the current legislative developments in all member countries of the INHOPE organization and the issues arising in the process of regulating AI-CSAM is presented, shedding light on the legal challenges regarding the regulation and limitation of the phenomenon.
What is in a name? Mitigating Name Bias in Text Embeddings via Anonymization
Manchanda, Sahil, Shivaswamy, Pannaga
Text-embedding models often exhibit biases arising from the data on which they are trained. In this paper, we examine a hitherto unexplored bias in text-embeddings: bias arising from the presence of $\textit{names}$ such as persons, locations, organizations, etc. in the text. Our study shows how the presence of $\textit{name-bias}$ in text-embedding models can potentially lead to erroneous conclusions in the assessment of thematic similarity. Text-embeddings can mistakenly indicate similarity between texts based on names in the text, even when their actual semantic content has no similarity, or indicate dissimilarity simply because of the names in the text, even when the texts match semantically. We first demonstrate the presence of name bias in different text-embedding models and then propose $\textit{text-anonymization}$ during inference, which involves removing references to names while preserving the core theme of the text. The efficacy of the anonymization approach is demonstrated on two downstream NLP tasks, achieving significant performance gains. Our simple and training-optimization-free approach offers a practical and easily implementable solution to mitigate name bias.
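A minimal sketch of the anonymization idea, under stated assumptions: the abstract does not say how names are detected, so this example takes a hand-supplied list of names and replaces each with a neutral placeholder before the text would be embedded. The `anonymize` helper and the `[NAME]` placeholder are hypothetical; in practice a named-entity recognizer would likely supply the names.

```python
import re


def anonymize(text, names, placeholder="[NAME]"):
    """Replace each listed name with a placeholder (whole-word matches only)."""
    if not names:
        return text
    # Try longer names first so a multi-word name wins over its prefix.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b"
    )
    return pattern.sub(placeholder, text)


names = ["Alice", "Bob", "Paris", "Rome"]
a = anonymize("Alice visited Paris last spring.", names)
b = anonymize("Bob visited Rome last spring.", names)
```

After anonymization both sentences read `[NAME] visited [NAME] last spring.`, so an embedding model would compare them on theme alone rather than on the names they mention.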
CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries
Liu, Shudong, Jin, Yiqiao, Li, Cheng, Wong, Derek F., Wen, Qingsong, Sun, Lichao, Chen, Haipeng, Xie, Xing, Wang, Jindong
Vision-language models (VLMs) have advanced human-AI interaction but struggle with cultural understanding, often misinterpreting symbols, gestures, and artifacts due to biases in predominantly Western-centric training data. In this paper, we construct CultureVerse, a large-scale multimodal benchmark covering 19,682 cultural concepts, 188 countries/regions, 15 cultural concepts, and 3 question types, with the aim of characterizing and improving VLMs' multicultural understanding capabilities. Then, we propose CultureVLM, a series of VLMs fine-tuned on our dataset to achieve significant performance improvement in cultural understanding. Our evaluation of 16 models reveals significant disparities, with a stronger performance in Western concepts and weaker results in African and Asian contexts. Fine-tuning on our CultureVerse enhances cultural perception, demonstrating cross-cultural, cross-continent, and cross-dataset generalization without sacrificing performance on models' general VLM benchmarks. We further present insights on cultural generalization and forgetting. We hope that this work could lay the foundation for more equitable and culturally aware multimodal AI systems.
NoisyEQA: Benchmarking Embodied Question Answering Against Noisy Queries
Wu, Tao, Zhou, Chuhao, Wong, Yen Heng, Gu, Lin, Yang, Jianfei
The rapid advancement of Vision-Language Models (VLMs) has significantly advanced the development of Embodied Question Answering (EQA), enhancing agents' abilities in language understanding and reasoning within complex and realistic scenarios. However, EQA in real-world scenarios remains challenging, as human-posed questions often contain noise that can interfere with an agent's exploration and response, bringing challenges especially for language beginners and non-expert users. To address this, we introduce a NoisyEQA benchmark designed to evaluate an agent's ability to recognize and correct noisy questions. This benchmark introduces four common types of noise found in real-world applications: Latent Hallucination Noise, Memory Noise, Perception Noise, and Semantic Noise, generated through an automated dataset creation framework. We also propose a 'Self-Correction' prompting mechanism and a new evaluation metric to enhance and measure both noise detection capability and answer quality. Our comprehensive evaluation reveals that current EQA agents often struggle to detect noise in questions, leading to responses that frequently contain erroneous information. Through our Self-Correct Prompting mechanism, we can effectively improve the accuracy of agent answers.